giCentre-Goodwin-MC1
VAST 2012 Challenge
Mini-Challenge 1: Bank of Money Enterprise: Cyber Situation Awareness
Team Members:
Alexander Kachkaev, giCentre, City University London, alexander.kachkaev.1@city.ac.uk
Iain Dillingham, giCentre, City University London, iain.dillingham.1@city.ac.uk
Roger Beecham, giCentre, City University London, roger.beecham.1@city.ac.uk
Sarah Goodwin, giCentre, City University London, sarah.goodwin.1@city.ac.uk PRIMARY
Nabiha Ahmed, giCentre, City University London, nabihakk@gmail.com
Aidan Slingsby, giCentre, City University London, a.slingsby@city.ac.uk
Student Team: YES
Tool(s):
BoM Network Status Application, developed by Alexander Kachkaev, giCentre, City University London
Google Earth
Video:
Answers to Mini-Challenge 1 Questions:
MC 1.1 Create a visualization of the health and policy status of the entire Bank of Money enterprise as of 2 pm BMT (BankWorld Mean Time) on February 2. What areas of concern do you observe?
Figure 1. Snapshot view: Facilities arranged geographically. Branches and regional headquarters ordered by latitude. Machines ordered by policyStatus. Hue represents policyStatus (grey: not reporting). The "details-on-demand" panel (DoD) shows summary information about machines in datacenter-5.
Most machines in region-5 and region-10 report moderate policy deviations (Figure 1, olive green cells). This is clearly not the case in other regions (bright green cells). Have these regions been compromised?
Figure 2. As Figure 1 but filtered to show workstations. 43% of workstations in region-34 headquarters are reporting (DoD top).
Between 50% and 70% of workstations in the westerly regions, where the time is 6am local or earlier, are reporting (Figure 2 left, non-grey cells). This is not consistent with business rules.
95% of servers in datacenter-5 are not reporting (Figure 1, DoD top). This is consistent across all server types (DoD bottom). Of servers that are reporting, 186 report moderate or serious policy deviations (DoD top). Has this datacenter been compromised?
None of the machines in the 15 southernmost facilities in region-25 are reporting (Figure 1 right, grey cells). Has a natural phenomenon affected operations?
Figure 3. Regions arranged geographically; branches ordered by latitude; regional headquarters aligned to cell bottom. Main headquarters and datacenters aligned to screen bottom. Machines ordered by activityFlag. Hue represents activityFlag. Filtered to show servers. Zoomed into datacenter-1 and datacenter-4.
Although servers in datacenter-1 and datacenter-4 have a normal number of connections, large numbers of machines report suspicious activity. For example, 1,070 machines in datacenter-1 report 5+ invalid login attempts (Figure 3, DoD top); many of these machines are compute servers (DoD bottom). Are these machines being targeted?
MC 1.2 Use your visualization tools to look at how the network's status changes over time. Highlight up to five potential anomalies in the network and provide a visualization of each. When did each anomaly begin and end? What might be an explanation of each anomaly?
Figure 4. Temporal overview: Facilities are arranged geographically. Time (BMT) is on the horizontal axis. Proportion of machines in each policyStatus category is on the vertical axis (grey: not reporting).
Figure 5. Temporal view: Facilities arranged geographically. Branches and regional headquarters ordered by latitude. Time (local) is on the horizontal axis. Brightness represents the proportion of machines reporting serious policy deviations (25% ceiling).
Network status clearly deteriorates over time and especially after 07:00 local on February 3.
We began by exploring a temporal (BMT) overview (Figure 4). This overview highlights that the proportion of machines reporting normal status (bright green) decreases over time, whilst the proportion reporting policy deviations and potential virus infections increases. It also highlights the expected diurnal pattern, whereby machines are turned off (and not reporting) outside of business hours (see above).
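The proportions plotted in such an overview can be sketched as follows. This is a minimal illustration, not the BoM Network Status Application's code: the function name, the `total_machines` input and the toy values are assumptions, and machines absent from a time slot are simply counted as "not reporting".

```python
from collections import Counter


def status_proportions(reported_statuses, total_machines):
    """Share of machines in each category for one facility and time slot.

    reported_statuses: policyStatus values from machines that reported.
    total_machines: number of machines expected in the facility
                    (hypothetical input, assumed known in advance).
    """
    if total_machines <= 0:
        raise ValueError("total_machines must be positive")
    if len(reported_statuses) > total_machines:
        raise ValueError("more reports than machines")
    counts = Counter(reported_statuses)
    shares = {status: n / total_machines for status, n in counts.items()}
    # Machines with no report in this slot form the grey band.
    missing = total_machines - len(reported_statuses)
    shares["not reporting"] = missing / total_machines
    return shares


# Toy slot (made-up values): 6 of 10 machines reported.
example = status_proportions([1, 1, 1, 2, 2, 3], total_machines=10)
```

Stacking these shares for each time slot gives one column of the overview; the shares always sum to 1, so the grey band grows exactly as reporting falls.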
We then used a temporal (local) view to explore network status in more detail, filtering by policyStatus and activityFlag categories (Figure 5). In each case, we modified the mapping between colour range and data range to highlight the small proportion of machines that report either abnormal status or abnormal activity. This view highlights that network status deteriorates noticeably after 07:00 local on February 3: notice the vertical banding in Figure 5.
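A modified colour mapping of this kind can be sketched as a clipped linear scale. The 25% ceiling comes from the Figure 5 caption; the linear form, the function name and the 0-to-1 brightness range are assumptions for illustration, since the tool's actual mapping is not described here.

```python
def brightness(proportion, ceiling=0.25):
    """Map a proportion in [0, 1] to a brightness in [0, 1].

    Values at or above `ceiling` saturate at full brightness, so small
    proportions occupy the whole colour range instead of a thin slice.
    """
    if not 0.0 < ceiling <= 1.0:
        raise ValueError("ceiling must be in (0, 1]")
    clipped = min(max(proportion, 0.0), ceiling)
    return clipped / ceiling


half = brightness(0.125)      # half of the ceiling
saturated = brightness(0.60)  # above the ceiling
```

With a ceiling of 0.25, a facility where 12.5% of machines report serious deviations is drawn at half brightness, whereas an unclipped scale would draw it at one eighth.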
This anomaly might be explained by an attack on the network. Malicious software on compromised machines could have been configured to activate at similar local times. This could explain why the expected diurnal pattern in the number of connections is almost unbroken (see below), whilst the network status clearly deteriorates: the network has been targeted, machines have been compromised, and the malicious software has been activated.
Figure 6. As Figure 5 but brightness represents the maximum number of connections. For example, in region-10, branch-40 the maximum number of connections at 02:15 local on February 3 is 100; the mean is 17 (DoD top).
Many machines in region-10 have a large number of connections from 02:15 local to 05:15 local on February 3.
We used a temporal (local) view to explore the maximum number of connections. This view highlights that machines in region-10 have a large number of connections in this time period—notice how region-10 has a vertical band in addition to the expected diurnal pattern (Figure 6).
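The reason for plotting the maximum rather than the mean can be sketched with a small aggregation. The record layout, the facility name and the connection counts below are made up for illustration and are not taken from the challenge data.

```python
from collections import defaultdict
from statistics import mean


def connection_summary(records):
    """Summarise connections per (facility, time slot).

    records: iterable of (facility, time_slot, num_connections) tuples,
             one per machine report (hypothetical layout).
    """
    grouped = defaultdict(list)
    for facility, slot, connections in records:
        grouped[(facility, slot)].append(connections)
    return {
        key: {"max": max(values), "mean": mean(values)}
        for key, values in grouped.items()
    }


# Toy reports: one machine with far more connections than its peers.
toy = [
    ("branch-A", "02:15", 3),
    ("branch-A", "02:15", 100),
    ("branch-A", "02:15", 5),
    ("branch-A", "02:30", 4),
]
summary = connection_summary(toy)
```

In the toy slot at 02:15 the mean is pulled up only moderately by the single busy machine, while the maximum exposes it directly, which is why a maximum-based view makes a band of unusually busy machines stand out.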
This anomaly might be explained by an attack on the network. Although we cannot determine the nature of these connections, machines in region-10 could be connecting to machines in other regions. Consequently, we speculate that many machines in region-10 were compromised in the early hours of February 3. These machines then connected to machines in other regions to spread malicious software, and this sequence caused the deterioration in network status (see above).
Figure 7. As Figure 5 but brightness represents the proportion of machines not reporting (white: all reporting; black: none reporting).
Figure 8. As Figure 7 but zoomed into region-25. All machines have started reporting by 01:00 local (DoD top).
There is a clear spatio-temporal trend in the southernmost facilities in region-25. Moving from south to north, machines stop reporting from 09:15 local on February 2 to 01:00 local on February 3.
We used a temporal (local) view to explore where and when machines are not reporting (Figure 7); region-25 is a salient case, as is datacenter-5 (see below). By zooming into region-25, selecting branches, and stepping forward and backward in time, we identified that machines in branch-33, branch-48 and branch-39 are the first to stop reporting, at 09:15 local (branch-33 and branch-39) and 09:45 local (branch-48) on February 2. These are followed by 13 branches to their north at 10:15 local, and so on until branch-49 and branch-17, further north, at 15:15 local. All branches have started reporting by 01:00 local on February 3 (Figure 8).
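The manual stepping described above could be approximated programmatically. The sketch below assumes a per-branch series of (time slot, machines reporting) pairs and treats a branch as having stopped reporting at the first slot where the count is zero; the branch names, counts and the year are placeholders, not challenge data.

```python
from datetime import datetime


def first_silent_slot(series):
    """Return the first time slot with zero machines reporting, or None.

    series: chronologically ordered list of (time_slot, machines_reporting).
    """
    for slot, reporting in series:
        if reporting == 0:
            return slot
    return None


def outage_order(branches):
    """Sort branches by the time at which they first stop reporting."""
    onsets = [(first_silent_slot(s), name) for name, s in branches.items()]
    # Branches that never fall silent are left out of the ordering.
    return sorted((slot, name) for slot, name in onsets if slot is not None)


# Toy series (local time); names and counts are hypothetical.
toy_branches = {
    "branch-south": [(datetime(2012, 2, 2, 9, 0), 40),
                     (datetime(2012, 2, 2, 9, 15), 0)],
    "branch-north": [(datetime(2012, 2, 2, 9, 15), 38),
                     (datetime(2012, 2, 2, 10, 15), 0)],
    "branch-unaffected": [(datetime(2012, 2, 2, 9, 15), 41),
                          (datetime(2012, 2, 2, 10, 15), 41)],
}
order = outage_order(toy_branches)
```

Comparing this ordering with each branch's latitude would show whether the outage moves from south to north, which is the trend reported for region-25.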
This anomaly supports our hypothesis that a natural phenomenon has affected operations (see above). We inspected the geography of Bank World using Google Earth and found that the affected region, Atta, is a mid-latitude, coastal region. Consequently, we speculate that an extratropical cyclone has caused a power outage (although Bank World's Pangaea-like geography makes this speculation tentative).
Figure 9. As Figure 7 but zoomed into datacenter-5. Only 243 of 51,087 machines were reporting at 05:15 local on February 2 (DoD top).
From 05:15 local to 13:00 local on February 2 almost all machines in datacenter-5 are not reporting.
Again, we zoomed into a temporal view (local) to explore when machines in datacenter-5 are not reporting (see above).
This anomaly might be explained by routine maintenance, although we would expect this work to occur overnight rather than throughout the morning. Without data from before 05:15 local (11:15 BMT) we cannot be sure. Nevertheless, it would seem that datacenter-5 fares as well as other regions in terms of network status (Figure 1), which suggests that many of its machines have also been compromised (see above).
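The conversion between local time and BMT used above can be sketched as a fixed offset. The offset of -6 hours for datacenter-5 is inferred only from the pair 05:15 local and 11:15 BMT quoted in the text; the function name, the constant and the year in the example are assumptions.

```python
from datetime import datetime, timedelta

# Local time minus BMT, in hours. Inferred from "05:15 local (11:15 BMT)";
# treat it as an assumption rather than a documented value.
DATACENTER_5_OFFSET_HOURS = -6


def local_to_bmt(local_time, offset_hours):
    """Convert a facility's local time to BMT given its offset from BMT."""
    return local_time - timedelta(hours=offset_hours)


def bmt_to_local(bmt_time, offset_hours):
    """Convert BMT to a facility's local time given its offset from BMT."""
    return bmt_time + timedelta(hours=offset_hours)


first_slot_local = datetime(2012, 2, 2, 5, 15)
first_slot_bmt = local_to_bmt(first_slot_local, DATACENTER_5_OFFSET_HOURS)
```

Under this assumed offset, the 2 pm BMT snapshot of Figure 1 corresponds to 08:00 local at datacenter-5, which falls inside the 05:15 to 13:00 local window in which almost all of its machines are not reporting.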